AITopics | Barinas State

Collaborating Authors

Barinas State

Applications and Advances of Artificial Intelligence in Music Generation:A Review

arXiv.org Artificial IntelligenceSep-3-2024

In recent years, artificial intelligence (AI) has made significant progress in the field of music generation, driving innovation in music creation and applications. This paper provides a systematic review of the latest research advancements in AI music generation, covering key technologies, models, datasets, evaluation methods, and their practical applications across various fields. The main contributions of this review include: (1) presenting a comprehensive summary framework that systematically categorizes and compares different technological approaches, including symbolic generation, audio generation, and hybrid models, helping readers better understand the full spectrum of technologies in the field; (2) offering an extensive survey of current literature, covering emerging topics such as multimodal datasets and emotion expression evaluation, providing a broad reference for related research; (3) conducting a detailed analysis of the practical impact of AI music generation in various application domains, particularly in real-time interaction and interdisciplinary applications, offering new perspectives and insights; (4) summarizing the existing challenges and limitations of music quality evaluation methods and proposing potential future research directions, aiming to promote the standardization and broader adoption of evaluation techniques. Through these innovative summaries and analyses, this paper serves as a comprehensive reference tool for researchers and practitioners in AI music generation, while also outlining future directions for the field.

dataset, music, music generation, (14 more...)

arXiv.org Artificial Intelligence

2409.03715

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
South America > Venezuela > Barinas State > Barinas (0.04)
North America > United States > New York (0.04)
(17 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Media > Music (1.00)
Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(4 more...)

Add feedback

Colour and Brush Stroke Pattern Recognition in Abstract Art using Modified Deep Convolutional Generative Adversarial Networks

Srinivasan, Srinitish, Pathak, Varenya

arXiv.org Artificial IntelligenceMar-27-2024

Abstract Art is an immensely popular, discussed form of art that often has the ability to depict the emotions of an artist. Many researchers have made attempts to study abstract art in the form of edge detection, brush stroke and emotion recognition algorithms using machine and deep learning. This papers describes the study of a wide distribution of abstract paintings using Generative Adversarial Neural Networks(GAN). GANs have the ability to learn and reproduce a distribution enabling researchers and scientists to effectively explore and study the generated image space. However, the challenge lies in developing an efficient GAN architecture that overcomes common training pitfalls. This paper addresses this challenge by introducing a modified-DCGAN (mDCGAN) specifically designed for high-quality artwork generation. The approach involves a thorough exploration of the modifications made, delving into the intricate workings of DCGANs, optimisation techniques, and regularisation methods aimed at improving stability and realism in art generation enabling effective study of generated patterns. The proposed mDCGAN incorporates meticulous adjustments in layer configurations and architectural choices, offering tailored solutions to the unique demands of art generation while effectively combating issues like mode collapse and gradient vanishing. Further this paper explores the generated latent space by performing random walks to understand vector relationships between brush strokes and colours in the abstract art space and a statistical analysis of unstable outputs after a certain period of GAN training and compare its significant difference. These findings validate the effectiveness of the proposed approach, emphasising its potential to revolutionise the field of digital art generation and digital art ecosystem.

discriminator, generative adversarial network, generator, (14 more...)

arXiv.org Artificial Intelligence

2403.18397

Country:

South America > Venezuela > Barinas State > Barinas (0.04)
North America > United States > Illinois (0.04)
North America > United States > California > Los Angeles County > Los Angeles (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

Neural-Base Music Generation for Intelligence Duplication

Galajda, Jacob, Hua, Kien

arXiv.org Artificial IntelligenceOct-20-2023

There are two aspects of machine learning and artificial intelligence: (1) interpreting information, and (2) inventing new useful information. Much advance has been made for (1) with a focus on pattern recognition techniques (e.g., interpreting visual data). This paper focuses on (2) with intelligent duplication (ID) for invention. We explore the possibility of learning a specific individual's creative reasoning in order to leverage the learned expertise and talent to invent new information. More specifically, we employ a deep learning system to learn from the great composer Beethoven and capture his composition ability in a hash-based knowledge base. This new form of knowledge base provides a reasoning facility to drive the music composition through a novel music generation method.

composer, neural base, neural network, (16 more...)

arXiv.org Artificial Intelligence

2310.13691

Country:

South America > Venezuela > Barinas State > Barinas (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Florida > Orange County > Orlando (0.04)

Genre:

Research Report (0.64)
Questionnaire & Opinion Survey (0.46)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

MSight: An Edge-Cloud Infrastructure-based Perception System for Connected Automated Vehicles

Zhang, Rusheng, Meng, Depu, Shen, Shengyin, Zou, Zhengxia, Li, Houqiang, Liu, Henry X.

arXiv.org Artificial IntelligenceOct-8-2023

As vehicular communication and networking technologies continue to advance, infrastructure-based roadside perception emerges as a pivotal tool for connected automated vehicle (CAV) applications. Due to their elevated positioning, roadside sensors, including cameras and lidars, often enjoy unobstructed views with diminished object occlusion. This provides them a distinct advantage over onboard perception, enabling more robust and accurate detection of road objects. This paper presents MSight, a cutting-edge roadside perception system specifically designed for CAVs. MSight offers real-time vehicle detection, localization, tracking, and short-term trajectory prediction. Evaluations underscore the system's capability to uphold lane-level accuracy with minimal latency, revealing a range of potential applications to enhance CAV safety and efficiency. Presently, MSight operates 24/7 at a two-lane roundabout in the City of Ann Arbor, Michigan.

detection, perception system, university, (15 more...)

arXiv.org Artificial Intelligence

2310.0529

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.49)
Asia > China > Beijing > Beijing (0.04)
Asia > China > Anhui Province > Hefei (0.04)
(5 more...)

Genre: Personal (0.93)

Industry:

Transportation > Ground > Road (1.00)
Information Technology (1.00)
Automobiles & Trucks (1.00)
Transportation > Infrastructure & Services (0.90)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Add feedback

Multiscale Attention via Wavelet Neural Operators for Vision Transformers

Nekoozadeh, Anahita, Ahmadzadeh, Mohammad Reza, Mardani, Zahra

arXiv.org Artificial IntelligenceAug-15-2023

Transformers have achieved widespread success in computer vision. At their heart, there is a Self-Attention (SA) mechanism, an inductive bias that associates each token in the input with every other token through a weighted basis. The standard SA mechanism has quadratic complexity with the sequence length, which impedes its utility to long sequences appearing in high resolution vision. Recently, inspired by operator learning for PDEs, Adaptive Fourier Neural Operators (AFNO) were introduced for high resolution attention based on global convolution that is efficiently implemented via FFT. However, the AFNO global filtering cannot well represent small and moderate scale structures that commonly appear in natural images. To leverage the coarse-to-fine scale structures we introduce a Multiscale Wavelet Attention (MWA) by leveraging wavelet neural operators which incurs linear complexity in the sequence size. We replace the attention in ViT with MWA and our experiments with CIFAR and Tiny-ImageNet classification demonstrate significant improvement over alternative Fourier-based attentions such as AFNO and Global Filter Network (GFN).

data quality, experiment, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2303.12398

Country:

Asia > Middle East > Iran > Isfahan Province > Isfahan (0.04)
South America > Venezuela > Barinas State > Barinas (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(3 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Data Science > Data Quality > Data Transformation (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

Priors in Deep Image Restoration and Enhancement: A Survey

Lu, Yunfan, Lin, Yiqi, Wu, Hao, Luo, Yunhao, Zheng, Xu, Xiong, Hui, Wang, Lin

arXiv.org Artificial IntelligenceJul-25-2023

Image restoration and enhancement is a process of improving the image quality by removing degradations, such as noise, blur, and resolution degradation. Deep learning (DL) has recently been applied to image restoration and enhancement. Due to its ill-posed property, plenty of works have been explored priors to facilitate training deep neural networks (DNNs). However, the importance of priors has not been systematically studied and analyzed by far in the research community. Therefore, this paper serves as the first study that provides a comprehensive overview of recent advancements in priors for deep image restoration and enhancement. Our work covers five primary contents: (1) A theoretical analysis of priors for deep image restoration and enhancement; (2) A hierarchical and structural taxonomy of priors commonly used in the DL-based methods; (3) An insightful discussion on each prior regarding its principle, potential, and applications; (4) A summary of crucial problems by highlighting the potential future directions, especially adopting the large-scale foundation models as prior, to spark more research in the community; (5) An open-source repository that provides a taxonomy of all mentioned works and code links.

artificial intelligence, image restoration, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2206.0207

Country:

North America > United States (0.14)
Asia > China > Hong Kong (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)
(3 more...)

Genre: Overview (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Convolutional Neural Network (CNN) to reduce construction loss in JPEG compression caused by Discrete Fourier Transform (DFT)

Kunwar, Suman

arXiv.org Artificial IntelligenceJul-2-2023

In recent decades, digital image processing has gained enormous popularity. Consequently, a number of data compression strategies have been put forth, with the goal of minimizing the amount of information required to represent images. Among them, JPEG compression is one of the most popular methods that has been widely applied in multimedia and digital applications. The periodic nature of DFT makes it impossible to meet the periodic condition of an image's opposing edges without producing severe artifacts, which lowers the image's perceptual visual quality. On the other hand, deep learning has recently achieved outstanding results for applications like speech recognition, image reduction, and natural language processing. Convolutional Neural Networks (CNN) have received more attention than most other types of deep neural networks. The use of convolution in feature extraction results in a less redundant feature map and a smaller dataset, both of which are crucial for image compression. In this work, an effective image compression method is purposed using autoencoders. The study's findings revealed a number of important trends that suggested better reconstruction along with good compression can be achieved using autoencoders.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2209.03475

Country:

South America > Venezuela > Barinas State > Barinas (0.04)
Oceania > Australia > Australian Capital Territory > Canberra (0.04)
North America > United States > California > Santa Clara County > San Jose (0.04)
Europe > Italy (0.04)

Genre: Research Report (0.53)

Industry: Media > Photography (0.35)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

The First Principles of Deep Learning and Compression

Ehrlich, Max

arXiv.org Machine LearningApr-4-2022

The deep learning revolution incited by the 2012 Alexnet paper has been transformative for the field of computer vision. Many problems which were severely limited using classical solutions are now seeing unprecedented success. The rapid proliferation of deep learning methods has led to a sharp increase in their use in consumer and embedded applications. One consequence of consumer and embedded applications is lossy multimedia compression which is required to engineer the efficient storage and transmission of data in these real-world scenarios. As such, there has been increased interest in a deep learning solution for multimedia compression which would allow for higher compression ratios and increased visual quality. The deep learning approach to multimedia compression, so called Learned Multimedia Compression, involves computing a compressed representation of an image or video using a deep network for the encoder and the decoder. While these techniques have enjoyed impressive academic success, their industry adoption has been essentially non-existent. Classical compression techniques like JPEG and MPEG are too entrenched in modern computing to be easily replaced. This dissertation takes an orthogonal approach and leverages deep learning to improve the compression fidelity of these classical algorithms. This allows the incredible advances in deep learning to be used for multimedia compression without threatening the ubiquity of the classical methods. The key insight of this work is that methods which are motivated by first principles, i.e., the underlying engineering decisions that were made when the compression algorithms were developed, are more effective than general methods. By encoding prior knowledge into the design of the algorithm, the flexibility, performance, and/or accuracy are improved at the cost of generality...

artificial intelligence, super-resolution generative adversarial network, vision and pattern recognition workshop, (15 more...)

arXiv.org Machine Learning

2204.01782

Country:

Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)
North America > United States > Maryland > Prince George's County > College Park (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
(4 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Summary/Review (0.87)

Industry:

Leisure & Entertainment (1.00)
Information Technology (1.00)
Education (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Blur, Noise, and Compression Robust Generative Adversarial Networks

Kaneko, Takuhiro, Harada, Tatsuya

arXiv.org Machine LearningMar-17-2020

Recently, generative adversarial networks (GANs), which learn data distributions through adversarial training, have gained special attention owing to their high image reproduction ability. However, one limitation of standard GANs is that they recreate training images faithfully despite image degradation characteristics such as blur, noise, and compression. To remedy this, we address the problem of blur, noise, and compression robust image generation. Our objective is to learn a non-degraded image generator directly from degraded images without prior knowledge of image degradation. The recently proposed noise robust GAN (NR-GAN) already provides a solution to the problem of noise degradation. Therefore, we first focus on blur and compression degradations. We propose blur robust GAN (BR-GAN) and compression robust GAN (CR-GAN), which learn a kernel generator and quality factor generator, respectively, with non-degraded image generators. Owing to the irreversible blur and compression characteristics, adjusting their strengths is non-trivial. Therefore, we incorporate switching architectures that can adapt the strengths in a data-driven manner. Based on BR-GAN, NR-GAN, and CR-GAN, we further propose blur, noise, and compression robust GAN (BNCR-GAN), which unifies these three models into a single model with additionally introduced adaptive consistency losses that suppress the uncertainty caused by the combination. We provide benchmark scores through large-scale comparative studies on CIFAR-10 and a generality analysis on FFHQ dataset.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Machine Learning

2003.07849

Country:

South America > Venezuela > Barinas State > Barinas (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
Africa > Central African Republic > Ombella-M'Poko > Bimbo (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

CAS-CNN: A Deep Convolutional Neural Network for Image Compression Artifact Suppression

Cavigelli, Lukas, Hager, Pascal, Benini, Luca

arXiv.org Artificial IntelligenceNov-22-2016

Lossy image compression algorithms are pervasively used to reduce the size of images transmitted over the web and recorded on data storage media. However, we pay for their high compression rate with visual artifacts degrading the user experience. Deep convolutional neural networks have become a widespread tool to address high-level computer vision tasks very successfully. Recently, they have found their way into the areas of low-level computer vision and image processing to solve regression problems mostly with relatively shallow networks. We present a novel 12-layer deep convolutional network for image compression artifact suppression with hierarchical skip connections and a multi-scale loss function. We achieve a boost of up to 1.79 dB in PSNR over ordinary JPEG and an improvement of up to 0.36 dB over the best previous ConvNet result. We show that a network trained for a specific quality factor (QF) is resilient to the QF used to compress the input image - a single network trained for QF 60 provides a PSNR gain of more than 1.5 dB over the wide QF range from 40 to 76.

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/IJCNN.2017.7965927

1611.07233

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
South America > Venezuela > Barinas State > Barinas (0.04)
North America > United States > Massachusetts > Middlesex County > Natick (0.04)
(2 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.85)

Add feedback